(54) Title: A FAST FREQUENCY TRANSFORMATION TECHNIQUE FOR TRANSFORM AUDIO CODERS



(57) Abstract 

A method for coding digital audio data in which coded Fast Modified Discrete Cosine
Transform (FMDCT) coefficients are computed utilising a Fast Fourier Transform (FFT)
method. The described method allows a significant reduction in computations as compared
to an ordinary DCT coding procedure. Also, pairs of audio channels can be combined to use
a single FFT computation, where the selected transform length for the paired channels is
the same. In such cases where pairing of identical transform length channels is not
possible, a long transform length channel is combined with a short transform length
channel and converted in two short transforms. A windowing function is also combined with
a pre-processing stage to the transformation, further decreasing computational
requirements.



[Figure: AUDIO ENCODER (CODEUR AUDIO) block diagram]





WO 99/43110



PCT/SG98/00014 



A FAST FREQUENCY TRANSFORMATION TECHNIQUE 
FOR TRANSFORM AUDIO CODERS 

Technical Field 

This invention is applicable in the field of multi-channel audio coders which use the
modified discrete cosine transform as a step in the compression of audio signals.

Background Art 

In order to more efficiently broadcast or record audio signals, the amount of information
required to represent the audio signals may be reduced. In the case of digital audio
signals, the amount of digital information needed to accurately reproduce the original
pulse code modulation (PCM) samples may be reduced by applying a digital compression
algorithm, resulting in a digitally compressed representation of the original signal. The
goal of the digital compression algorithm is to produce a digital representation of an audio
signal which, when decoded and reproduced, sounds the same as the original signal, while
using a minimum of digital information for the compressed or encoded representation.

Recent advances in audio coding technology have led to high compression ratios while
keeping audible degradation in the compressed signal to a minimum. These coders are
intended for a variety of applications, including 5.1 channel film soundtracks, HDTV,
laser discs and multimedia. A description of one applicable method can be found in the
Advanced Television Systems Committee (ATSC) Standard document entitled "Digital
Audio Compression (AC-3) Standard", Document A/52, 20 December, 1995.

In the basic approach, at the encoder the time domain audio signal is first converted to the
frequency domain using a bank of filters. The frequency domain coefficients, thus
generated, are converted to fixed point representation. In fixed point syntax, each
coefficient is represented as a mantissa and an exponent. The bulk of the compressed
bitstream transmitted to the decoder comprises these exponents and mantissas.

The exponents are usually transmitted in their original form. However, each mantissa
must be truncated to a fixed or variable number of decimal places. The number of bits to
be used for coding each mantissa is obtained from a bit allocation algorithm which may be
based on the masking property of the human auditory system. Lower numbers of bits
result in higher compression ratios because less space is required to transmit the
coefficients. However, this may cause high quantization errors, leading to audible
distortion. A good distribution of available bits to each mantissa forms the core of the
advanced audio coders.

The frequency transformation phase has one of the greatest computation requirements in a
transform coder. Therefore, an efficient implementation of this phase can decrease the
computation requirement of the system significantly and make real time operation of the
encoder more easily attainable.

In some encoders such as those specified in the AC-3 standard, the frequency domain
transformation of signals is performed by the modified discrete cosine transform (MDCT).
If directly implemented, the MDCT requires $O(N^2)$ additions and multiplications.
However it has been found possible to reduce the number of required operations
significantly if the MDCT equation can be computed in a form that is amenable to
the use of the well known Fast Fourier Transform (FFT) method of J. W. Cooley and J. W.
Tukey (1965). Moreover, using a single FFT for two channels can result in a greater
reduction in the computational requirements of the system.

Summary of the Invention 

In accordance with the present invention there is provided a method for coding audio data
comprising a sequence of digital audio samples, including the steps of:

i) multiplying the input samples with a first trigonometric function factor to
generate an intermediate sample sequence;

ii) computing a fast Fourier transform of the intermediate sample sequence to
generate a Fourier transform coefficient sequence;

iii) for each transform coefficient in the sequence, multiplying the real and
imaginary components of the transform coefficient by respective second trigonometric
function factors, adding the multiplied real and imaginary transform coefficient
components to generate an addition stream coefficient, and subtracting the multiplied real
and imaginary transform coefficient components to generate a subtraction stream
coefficient;

iv) multiplying the addition and subtraction stream coefficients with respective
third trigonometric function factors; and

v) subtracting the corresponding multiplied addition and subtraction stream
coefficients to generate audio coded frequency domain coefficients.

The present invention also provides a method for coding audio data, including the steps
of:

combining first and second sequences of digital audio samples from first and
second audio channels into a single complex sample sequence;

determining a Fourier transform coefficient sequence as defined above;

generating first and second transform coefficient sequences by combining and/or
differencing first and second selected transform coefficients from said Fourier transform
coefficient sequence; and

for each of the first and second transform coefficient sequences, generating audio
coded frequency domain coefficients as defined above, so as to generate respective
sequences of said audio coded frequency domain coefficients for the first and second audio
channels.

The present invention also provides a method for coding audio data including the steps of:

obtaining at least one input sequence of digital audio samples;

pre-processing the input sequence samples including applying a pre-multiplication
factor to obtain modified input sequence samples;

transforming the modified input sequence samples into a transform coefficient
sequence utilising a fast Fourier transform; and

post-processing the sequence of transform coefficients including applying first
post-multiplication factors to the real and imaginary coefficient components, differencing and
combining the post-multiplied real and imaginary components, applying second
post-multiplication factors to the difference and combination results, and differencing to obtain
a sequence of modified discrete cosine transform coefficients representing said input
sequence of digital audio samples.

The present invention also provides a method for coding audio data including the steps of:

obtaining first and second input sequences of digital audio samples corresponding
to respective first and second audio channels;

combining the first and second input sequences of digital audio samples into a
single complex input sample sequence;

pre-processing the complex input sequence samples including applying a
pre-multiplication factor to obtain modified complex input sequence samples;

transforming the modified complex input sequence samples into a complex
transform coefficient sequence utilising a fast Fourier transform; and

post-processing the sequence of complex transform coefficients to obtain first and
second sequences of audio coded frequency domain coefficients corresponding to the first
and second audio channels including, for each corresponding frequency domain coefficient
in the first and second sequences, selecting first and second complex transform coefficients
from said sequence of complex transform coefficients, combining the first complex
transform coefficient and the complex conjugate of the second complex transform
coefficient for said first channel and differencing the first complex transform coefficient
and the complex conjugate of the second complex transform coefficient for said second
channel, and applying respective post-multiplication factors to the combination and
difference to obtain said audio coded frequency domain coefficients corresponding to the
first and second audio channels.

The present invention further provides a method for coding audio data including the steps
of:

obtaining first and second input sequences of digital audio samples $x[n]$, $y[n]$
corresponding to respective first and second audio channels;

combining the first and second input sequences of digital audio samples into a
single complex input sample sequence $z[n]$, where $z[n] = x[n] + j\,y[n]$;

pre-processing the complex input sequence samples including applying a
pre-multiplication factor $\cos(\pi n/N) + j\sin(\pi n/N)$ to obtain modified complex input
sequence samples, where $N$ is the number of audio samples in each of the first and second
input sequences and $n = 0, \ldots, (N-1)$;

transforming the modified complex input sequence samples into a complex
transform coefficient sequence $Z_k$ utilising a fast Fourier transform, wherein
$k = 0, \ldots, (N/2-1)$; and

post-processing the sequence of complex transform coefficients to obtain first and
second sequences of audio coded frequency domain coefficients $X_k$, $Y_k$ corresponding
to the first and second audio channels according to:

$$X_k = \cos\gamma \cdot \bigl(g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)\bigr)
      - \sin\gamma \cdot \bigl(g_{k,r}\sin(\pi(k+1/2)/N) + g_{k,i}\cos(\pi(k+1/2)/N)\bigr)$$

$$Y_k = \cos\gamma \cdot \bigl(g'_{k,r}\cos(\pi(k+1/2)/N) - g'_{k,i}\sin(\pi(k+1/2)/N)\bigr)
      - \sin\gamma \cdot \bigl(g'_{k,r}\sin(\pi(k+1/2)/N) + g'_{k,i}\cos(\pi(k+1/2)/N)\bigr)$$

where $G_k$ is a transform coefficient sequence for the first channel;

$G'_k$ is a transform coefficient sequence for the second channel;

$g_{k,r}$ and $g_{k,i}$ are the real and imaginary transform coefficient components of $G_k$;

$g'_{k,r}$ and $g'_{k,i}$ are the real and imaginary transform coefficient components of $G'_k$;

$Z^*_{N-k-1}$ is the complex conjugate of $Z_{N-k-1}$; and

$\gamma(k) = \pi(2k+1)/4$.

The modified discrete cosine transform equation can be expressed as

$$X_k = \sum_{n=0}^{N-1} x[n]\cos\bigl(2\pi(2n+1)(2k+1)/4N + \pi(2k+1)/4\bigr), \qquad k = 0, \ldots, (N/2-1)$$

where $x[n]$ is the input sequence for a channel and $N$ is the transform length.

Instead of evaluating $X_k$ in the form given above it could be computed as

$$X_k = \cos\gamma \cdot \bigl(g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)\bigr)
      - \sin\gamma \cdot \bigl(g_{k,r}\sin(\pi(k+1/2)/N) + g_{k,i}\cos(\pi(k+1/2)/N)\bigr),
      \qquad g_{k,r}, g_{k,i} \in \mathbb{R} \text{ (set of real numbers)}$$

where $G_k = g_{k,r} + j\,g_{k,i} = \sum_{n=0}^{N-1} \bigl(x[n]\,e^{j\pi n/N}\bigr)\,e^{j2\pi nk/N}$.
The symbol $j$ represents the imaginary number $\sqrt{-1}$. The expression is obtained from
the well known FFT method, by first using the transformation $x'[n] = x[n]\,e^{j\pi n/N}$ and
then computing the FFT $G_k = \sum_{n=0}^{N-1} x'[n]\,e^{j2\pi nk/N}$.

For a two channel approach, a complex variable $z[n] = x[n]\,e^{j\pi n/N} + j\,y[n]\,e^{j\pi n/N}$
is defined, where $x[n]$ and $y[n]$ are the sample sequences for the two channels and
$e^{j\pi n/N}$ represents the pre-multiplication factor. Using the FFT approach, the frequency
coefficient $Z_k$ for the variable $z[n]$ is computed. From $Z_k$, the values
$G_k = (Z_k + Z^*_{N-k-1})/2$ and $G'_k = (Z_k - Z^*_{N-k-1})/2j$, required to compute the
final MDCT for each channel, respectively, are calculated.

If either or both of the channels require short length transforms, two short transforms are
taken using the above approach. If neither needs a short transform, a single long transform
is used. As an additional step in reducing computation, the windowing function can be
combined with the pre-processing stage.

Brief Description of the Drawings

The invention is described in detail hereinafter, by way of example only, with reference to
preferred embodiments thereof and with aid of the accompanying drawings, wherein:

Figure 1 is a diagrammatic representation of a stream of audio data and the
substructure arrangement thereof;

Figure 2 is a functional block diagram of a digital audio encoder;

Figure 3 is a functional block diagram of a system for encoding a single audio
channel; and

Figure 4 is a functional block diagram of a system for encoding a pair of audio
channels.

Detailed Description of the Preferred Embodiments 

The above mentioned Advanced Television Systems Committee (ATSC) Standard
document entitled "Digital Audio Compression (AC-3) Standard" (Document A/52, 20
December, 1995) describes methods for encoding and decoding audio signals, and is
hereby expressly incorporated herein by reference.

In general, the input to an audio coder comprises a stream of digitised samples of the time
domain analog signal. For a multi-channel encoder the stream consists of interleaved
samples for each channel. The input stream is sectioned into blocks, each block
containing N consecutive samples of each channel (see Fig. 1). Thus within a block the N
samples of a channel form a sequence {x[0], x[1], x[2], ..., x[N-1]}.
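
The block sectioning described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `split_blocks`, the sample-by-sample interleaving order (channel 0, channel 1, ..., repeating) and the use of non-overlapping blocks are choices made here for the example, not details taken from the text.

```python
def split_blocks(stream, channels, N):
    """Split an interleaved sample stream into blocks of N samples per channel.

    Assumes (hypothetically) that samples are interleaved one at a time:
    c0, c1, ..., c0, c1, ...  Returns blocks[b][c] = list of N samples.
    """
    # De-interleave: every `channels`-th sample belongs to the same channel.
    per_channel = [stream[c::channels] for c in range(channels)]
    # Only complete blocks are kept; a trailing partial block is dropped.
    n_blocks = min(len(ch) for ch in per_channel) // N
    return [
        [ch[b * N:(b + 1) * N] for ch in per_channel]
        for b in range(n_blocks)
    ]


# Two channels, blocks of two samples each.
demo = split_blocks(list(range(8)), channels=2, N=2)
```

Each inner list `demo[b][c]` then plays the role of the sequence {x[0], ..., x[N-1]} for one channel of one block.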

The time domain samples are next converted to the frequency domain using an analysis
filter bank (see Fig. 2). The frequency domain coefficients, thus generated, form a
coefficient set which can be identified as $(X_0, X_1, X_2, \ldots, X_{N/2-1})$. Since the signal
is real only the first N/2 frequency components are considered. Here $X_0$ is the lowest
frequency (DC) component while $X_{N/2-1}$ is the highest frequency component of the signal.

Audio compression essentially entails finding how much of the information in the set
$(X_0, X_1, \ldots, X_{N/2-1})$ is necessary to reproduce the original analog signal at the
decoder with minimal audible distortion.

The coefficient set is normally converted into floating point format, where each coefficient
is represented by an exponent and mantissa. The exponent set is usually transmitted in its
original form. However, the mantissa is truncated to a fixed or variable number of
decimal places. The number of bits for coding a mantissa is usually obtained
from a bit allocation algorithm which for advanced psychoacoustic coders may be based
on the masking property of the human auditory system. A low number of bits results in a
high compression ratio because less space is required to transmit the coefficients.
However this causes very high quantization error leading to audible distortion. A good
distribution of available bits to each mantissa forms the core of the most advanced
encoders.

In some encoders such as the AC-3, the frequency domain transformation of signals is
performed by the modified discrete cosine transform (MDCT) (Eq. 1).

$$X_k = \sum_{n=0}^{N-1} x[n]\cos\bigl(2\pi(2n+1)(2k+1)/4N + \pi(2k+1)/4\bigr), \qquad k = 0, \ldots, (N/2-1) \qquad \text{Eq. 1}$$

If directly implemented in the form given above, the MDCT requires $O(N^2)$ additions and
multiplications.
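
As a point of reference, a direct evaluation of Eq. 1 might look like the following Python sketch. The function name `mdct_direct` is an assumption made for this example; the nested loops make the quadratic cost visible, since each of the N/2 outputs sums over all N inputs.

```python
import math


def mdct_direct(x):
    """Evaluate Eq. 1 literally: N/2 coefficients, each a sum over N samples."""
    N = len(x)
    out = []
    for k in range(N // 2):
        gamma = math.pi * (2 * k + 1) / 4  # phase term depending on k only
        acc = 0.0
        for n in range(N):
            alpha = 2 * math.pi * (2 * n + 1) * (2 * k + 1) / (4 * N)
            acc += x[n] * math.cos(alpha + gamma)
        out.append(acc)
    return out


# A unit impulse at n = 0 leaves a single cosine term per coefficient.
impulse_mdct = mdct_direct([1.0, 0.0, 0.0, 0.0])
```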

Single Channel FFT

It is possible to reduce the number of required operations significantly if one is able to
evaluate Eq. 1 using the well known Fast Fourier Transform method of J. W. Cooley and
J. W. Tukey (1965). The general Discrete Fourier Transform (DFT) is given below (Eq.
2). It requires $O(N^2)$ complex additions and multiplications. By using the Fast Fourier
Transform method the DFT in Eq. 2 can be computed with $O(N\log_2 N)$ operations only.
Here $j$ is the symbol for the imaginary number, i.e. $j = \sqrt{-1}$.

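Although it may not be immediately apparent how Eq. 1 can be transformed to Eq. 2, a
careful analysis shows that this is indeed possible. To simplify Eq. 1, two functions can
be defined

$$\alpha(n,k) = 2\pi(2n+1)(2k+1)/4N \qquad \text{Eq. 3}$$

$$\gamma(k) = \pi(2k+1)/4 \qquad \text{Eq. 4}$$

Then, using these functions, Eq. 1 can be rewritten as

$$X_k = \sum_{n=0}^{N-1} x[n]\cos\bigl(\alpha(n,k)+\gamma(k)\bigr) \qquad \text{Eq. 5}$$

$$X_k = \sum_{n=0}^{N-1} x[n]\bigl(\cos\alpha(n,k)\cos\gamma(k) - \sin\alpha(n,k)\sin\gamma(k)\bigr) \qquad \text{Eq. 6}$$

In Eq. 6 the trigonometric equality $\cos(a+b) = \cos a\cos b - \sin a\sin b$ is used for
simplification. Furthermore, since the function $\gamma(k)$ is not dependent on variable $n$, it
can be brought outside the summation expression to give

$$X_k = \cos\gamma(k)\sum_{n=0}^{N-1} x[n]\cos\alpha(n,k) - \sin\gamma(k)\sum_{n=0}^{N-1} x[n]\sin\alpha(n,k)
      = T_1\cos\gamma(k) - T_2\sin\gamma(k) \qquad \text{Eq. 7}$$

where $T_1 = \sum_{n=0}^{N-1} x[n]\cos\alpha(n,k)$ and $T_2 = \sum_{n=0}^{N-1} x[n]\sin\alpha(n,k)$.

The two terms, $T_1$ and $T_2$, can now be evaluated separately. Using Euler's identity
$e^{j\theta} = \cos\theta + j\sin\theta$, we can express:

$$\cos\alpha(n,k) = \bigl(e^{j\alpha(n,k)} + e^{-j\alpha(n,k)}\bigr)/2$$

and

$$\sin\alpha(n,k) = \bigl(e^{j\alpha(n,k)} - e^{-j\alpha(n,k)}\bigr)/2j.$$

Therefore we can rewrite the term $T_1$ as

$$T_1 = \sum_{n=0}^{N-1} x[n]\bigl(e^{j\alpha} + e^{-j\alpha}\bigr)/2
     = \tfrac{1}{2}\Bigl(\sum_{n=0}^{N-1} x[n]e^{j\alpha} + \sum_{n=0}^{N-1} x[n]e^{-j\alpha}\Bigr)
     = \tfrac{1}{2}(A_1 + A_2) \qquad \text{Eq. 8}$$

where $A_1 = \sum_{n=0}^{N-1} x[n]e^{j\alpha(n,k)}$ and $A_2 = \sum_{n=0}^{N-1} x[n]e^{-j\alpha(n,k)}$.

Similarly

$$T_2 = \tfrac{1}{2j}(A_1 - A_2) \qquad \text{Eq. 9}$$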



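The term $A_1$ from Eq. 8 and Eq. 9 can thus be evaluated as

$$A_1 = \sum_{n=0}^{N-1} x[n]\,e^{j\alpha(n,k)}
     = \sum_{n=0}^{N-1} x[n]\,e^{j2\pi(2n+1)(2k+1)/4N}
     = e^{j\pi(k+1/2)/N}\sum_{n=0}^{N-1} x[n]\,e^{j\pi n/N}\,e^{j2\pi nk/N} \qquad \text{Eq. 10}$$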
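
The factorisation used in Eq. 10 rests on splitting the exponent $\alpha(n,k)$ of Eq. 3 into a part that depends on $k$ only, a part that depends on $n$ only, and the usual DFT cross term. A worked expansion, written out here as a supporting step rather than quoted from the original text, is:

```latex
\begin{aligned}
\alpha(n,k) &= \frac{2\pi(2n+1)(2k+1)}{4N}
             = \frac{\pi\,(4nk + 2n + 2k + 1)}{2N} \\
            &= \frac{2\pi nk}{N} + \frac{\pi n}{N} + \frac{\pi\,(k+\tfrac{1}{2})}{N}, \\[4pt]
e^{j\alpha(n,k)} &= e^{j\pi(k+1/2)/N}\; e^{j\pi n/N}\; e^{j2\pi nk/N}.
\end{aligned}
```

The first factor does not depend on $n$ and can leave the summation, the second is absorbed into the input samples, and the third is the DFT kernel.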

If a complex variable is defined as:

$$x'[n] = x[n]\,e^{j\pi n/N} \qquad \text{Eq. 11}$$

then Eq. 10 is simply:

$$A_1 = e^{j\pi(k+1/2)/N}\,G_k \qquad \text{Eq. 12}$$

where $G_k = \sum_{n=0}^{N-1} x'[n]\,e^{j2\pi nk/N}$.

The complex term $G_k = g_{k,r} + j\,g_{k,i}$, where $g_{k,r}$ and $g_{k,i} \in \mathbb{R}$ (set of
real numbers), in Eq. 12 is essentially the same as in Eq. 2. Therefore the FFT approach can
be used to evaluate $G_k$. This brings down computation from $O(N^2)$ to $O(N\log N)$.
Similarly, the second term $A_2$ in Eq. 8 and Eq. 9 can be evaluated

$$A_2 = e^{-j\pi(k+1/2)/N}\,G_k^* \qquad \text{Eq. 13}$$

where $G_k^* = \sum_{n=0}^{N-1} x[n]\,e^{-j\pi n/N}\,e^{-j2\pi nk/N}$.

Note that $G_k^*$ is actually the complex conjugate of $G_k$ which was obtained by Eq. 12. That
is, if $G_k = g_{k,r} + j\,g_{k,i}$, where $g_{k,r}$ and $g_{k,i} \in \mathbb{R}$ as defined earlier,
then $G_k^* = g_{k,r} - j\,g_{k,i}$. Therefore $G_k^*$ in Eq. 13 does not need to be computed again,
and the result from Eq. 12 can be re-used. That is, only one FFT needs to be computed for the
evaluation of $T_1$. The result of Eq. 8 to Eq. 13 is thus

$$T_1 = \tfrac{1}{2}\bigl(e^{j\pi(k+1/2)/N}\,G_k + e^{-j\pi(k+1/2)/N}\,G_k^*\bigr) \qquad \text{Eq. 14}$$

Next, the term $T_2$ can be analysed

$$T_2 = \tfrac{1}{2j}(A_1 - A_2)
     = \tfrac{1}{2j}\bigl(e^{j\pi(k+1/2)/N}\,G_k - e^{-j\pi(k+1/2)/N}\,G_k^*\bigr) \qquad \text{Eq. 15}$$

Finally, after simplifications of Eq. 7, 14 and 15

$$\begin{aligned}
X_k &= \cos\gamma(k)\cdot\tfrac{1}{2}\bigl(e^{j\pi(k+1/2)/N}G_k + e^{-j\pi(k+1/2)/N}G_k^*\bigr)
     - \sin\gamma(k)\cdot\tfrac{1}{2j}\bigl(e^{j\pi(k+1/2)/N}G_k - e^{-j\pi(k+1/2)/N}G_k^*\bigr) \\
    &= \cos\gamma\cdot\bigl(g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)\bigr)
     - \sin\gamma\cdot\bigl(g_{k,r}\sin(\pi(k+1/2)/N) + g_{k,i}\cos(\pi(k+1/2)/N)\bigr) \\
    &= \cos\gamma\cdot T_1 - \sin\gamma\cdot T_2 \qquad \text{Eq. 16}
\end{aligned}$$

The term $G_k = g_{k,r} + j\,g_{k,i}$ is computed in $O(N\log N)$ operations by use of FFT
algorithms. The additional operation outlined in Eq. 16 to extract the final $X_k$ is only of
order $O(N)$. Therefore the MDCT can now be computed in $O(N\log_2 N)$ time. The operations
required to obtain the MDCT are illustrated in Fig. 3.
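
The single channel procedure (pre-multiplication as in Eq. 11, one FFT, post-processing as in Eq. 16) can be sketched in Python as below. This is a minimal illustration, not the implementation of Fig. 3: the names `fft_pos`, `mdct_fft` and `mdct_direct` are assumptions of this example, the recursive radix-2 routine assumes that N is a power of two, and it uses the positive-exponent kernel that appears in Eq. 12.

```python
import cmath
import math


def fft_pos(a):
    """Radix-2 transform with kernel e^{+j*2*pi*n*k/N}; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft_pos(a[0::2])
    odd = fft_pos(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(2j * math.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def mdct_fft(x):
    """MDCT of Eq. 1 via pre-multiplication, one FFT and O(N) post-processing."""
    N = len(x)
    # Eq. 11: x'[n] = x[n] * e^{j*pi*n/N}
    xp = [x[n] * cmath.exp(1j * math.pi * n / N) for n in range(N)]
    G = fft_pos(xp)  # G_k of Eq. 12
    X = []
    for k in range(N // 2):
        theta = math.pi * (k + 0.5) / N
        gamma = math.pi * (2 * k + 1) / 4
        g_r, g_i = G[k].real, G[k].imag
        t1 = g_r * math.cos(theta) - g_i * math.sin(theta)  # T_1
        t2 = g_r * math.sin(theta) + g_i * math.cos(theta)  # T_2
        X.append(t1 * math.cos(gamma) - t2 * math.sin(gamma))  # Eq. 16
    return X


def mdct_direct(x):
    """Reference: literal evaluation of Eq. 1, used only to check the result."""
    N = len(x)
    return [
        sum(x[n] * math.cos(2 * math.pi * (2 * n + 1) * (2 * k + 1) / (4 * N)
                            + math.pi * (2 * k + 1) / 4) for n in range(N))
        for k in range(N // 2)
    ]


samples = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, 3.0]
fast = mdct_fft(samples)
slow = mdct_direct(samples)
max_err = max(abs(a - b) for a, b in zip(fast, slow))
```

Comparing `fast` against `slow` is a convenient sanity check that the post-processing signs match Eq. 16.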

Combining Two Channels into Single FFT

Suppose the multi-channel encoder is required to process m audio channels. Instead of
computing an FFT for each channel as described in the previous section, it is possible to
further reduce the computational requirement of the coder by combining two channels and
using a single FFT only. In effect, instead of m FFTs only m/2 FFTs need to be
computed.

If the input sequences are real numbers then it is known that the DFT for any two channels
can be computed with only one FFT block by considering the input as a complex number.
The real part is formed from the sequence for any one channel and the imaginary part is
from data of another channel. After the Fourier Transform is computed for the resulting
complex variable, the resulting transform for each channel can be easily retrieved.

However, in the present case the input data to the FFT block is actually a complex number
(formed by multiplying the real data by the complex variable $e^{j\pi n/N}$). In this case,
there is no straightforward way of retrieving the frequency transform after having combined
two channels. However, using some processing after the FFT one can still compute the DFT
of two channels using a single FFT block.

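Let {x[0], x[1], x[2], ..., x[N-1]} be N input samples of the first channel and
{y[0], y[1], y[2], ..., y[N-1]} be the samples for the second channel. As described above, the
frequency coefficients $G_k = \sum_{n=0}^{N-1} x[n]\,e^{j\pi n/N}\,e^{j2\pi nk/N}$ (Eq. 12 and 13)
must be obtained for the first channel; and similarly
$G'_k = \sum_{n=0}^{N-1} y[n]\,e^{j\pi n/N}\,e^{j2\pi nk/N}$ for the second channel.

Defining the complex variable

$$z[n] = x[n]\,e^{j\pi n/N} + j\,y[n]\,e^{j\pi n/N} \qquad \text{Eq. 17}$$

and computing its DFT using the FFT method, yields

$$Z_k = \sum_{n=0}^{N-1} (x[n] + j\,y[n])\,e^{j\pi n/N}\,e^{j2\pi nk/N} \qquad \text{Eq. 18}$$

Now substituting $N-k$ for $k$ in the above expression,

$$Z_{N-k} = \sum_{n=0}^{N-1} (x[n] + j\,y[n])\,e^{j\pi n/N}\,e^{j2\pi n(N-k)/N}
          = \sum_{n=0}^{N-1} (x[n] + j\,y[n])\,e^{j\pi n/N}\,e^{-j2\pi nk/N} \qquad \text{Eq. 19}$$

Since $e^{j2\pi n} = 1$, $n \in I$ (the set of integers), the term $e^{j2\pi n}$ vanishes in the
above expression. Taking the complex conjugate of $Z_{N-k}$:

$$Z^*_{N-k} = \sum_{n=0}^{N-1} (x[n] - j\,y[n])\,e^{-j\pi n/N}\,e^{j2\pi nk/N} \qquad \text{Eq. 20}$$

Using Eq. 18 and 20, separate expressions for $G_k$ and $G'_k$ are required. In a simple case
the conjugates in Eq. 18 and 20 should add and subtract to give the required expressions.
However in this instance that is not the case. But, substituting $N-k$ by $N-k-1$ in Eq. 20, the
following is obtained

$$Z^*_{N-k-1} = \sum_{n=0}^{N-1} (x[n] - j\,y[n])\,e^{j2\pi n(k+1/2)/N} \qquad \text{Eq. 21}$$

Now the term $e^{j2\pi n(k+1/2)/N} = e^{j\pi n/N}\,e^{j2\pi nk/N}$ is common in both Eq. 18 and 21,
and it is possible to isolate $G_k$ and $G'_k$:

$$Z_k + Z^*_{N-k-1} = \sum_{n=0}^{N-1} (x[n] + j\,y[n])\,e^{j2\pi n(k+1/2)/N}
                    + \sum_{n=0}^{N-1} (x[n] - j\,y[n])\,e^{j2\pi n(k+1/2)/N}
                    = 2\,G_k$$

Similarly,

$$Z_k - Z^*_{N-k-1} = 2j\,G'_k$$

That is

$$G_k = (Z_k + Z^*_{N-k-1})/2, \qquad k = 0, \ldots, N/2-1 \qquad \text{Eq. 22}$$

and

$$G'_k = (Z_k - Z^*_{N-k-1})/2j \qquad \text{Eq. 23}$$

Substituting the expressions from Eq. 22 and 23 into Eq. 16, the MDCT for each channel is
obtained. The overall process is illustrated in Fig. 4.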
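
The two channel procedure can be sketched in Python as follows, again only as an illustration under stated assumptions: the names `fft_pos`, `post_process`, `mdct_pair` and `mdct_direct` are hypothetical, N is assumed to be a power of two, and the transform uses the positive-exponent kernel of Eq. 12. One FFT of the combined sequence is followed by the separation of Eq. 22 and Eq. 23 and the post-processing of Eq. 16.

```python
import cmath
import math


def fft_pos(a):
    """Radix-2 transform with kernel e^{+j*2*pi*n*k/N}; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft_pos(a[0::2]), fft_pos(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def post_process(G, k, N):
    """Eq. 16: turn one complex coefficient G_k into one MDCT coefficient."""
    theta = math.pi * (k + 0.5) / N
    gamma = math.pi * (2 * k + 1) / 4
    t1 = G.real * math.cos(theta) - G.imag * math.sin(theta)
    t2 = G.real * math.sin(theta) + G.imag * math.cos(theta)
    return t1 * math.cos(gamma) - t2 * math.sin(gamma)


def mdct_pair(x, y):
    """MDCT of two equal-length real channels using a single complex FFT."""
    N = len(x)
    # Eq. 17: z[n] = (x[n] + j*y[n]) * e^{j*pi*n/N}
    z = [(x[n] + 1j * y[n]) * cmath.exp(1j * math.pi * n / N) for n in range(N)]
    Z = fft_pos(z)
    X, Y = [], []
    for k in range(N // 2):
        zc = Z[N - k - 1].conjugate()
        G = (Z[k] + zc) / 2        # Eq. 22, first channel
        Gp = (Z[k] - zc) / 2j      # Eq. 23, second channel
        X.append(post_process(G, k, N))
        Y.append(post_process(Gp, k, N))
    return X, Y


def mdct_direct(x):
    """Reference: literal evaluation of Eq. 1, used only to check the result."""
    N = len(x)
    return [
        sum(x[n] * math.cos(2 * math.pi * (2 * n + 1) * (2 * k + 1) / (4 * N)
                            + math.pi * (2 * k + 1) / 4) for n in range(N))
        for k in range(N // 2)
    ]


left = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, 3.0]
right = [1.0, 0.0, -2.5, 0.75, 0.5, -0.5, 2.0, 1.25]
X, Y = mdct_pair(left, right)
err_x = max(abs(a - b) for a, b in zip(X, mdct_direct(left)))
err_y = max(abs(a - b) for a, b in zip(Y, mdct_direct(right)))
```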



Transform Length Adjustment Technique 

The frequency transform length N is decided by the encoder based on temporal and
spectral resolution requirements. The input signal is usually analysed with a high
frequency bandpass filter to detect the presence of transients. This information is used to
adjust the block length, restricting quantization noise associated with the transient within a
small temporal region about the transient, avoiding temporal masking. Thus, if a transient
is detected in a channel, two short transforms of length N/2 each are taken. In the absence
of a transient, a single long transform of length N is used, thus providing higher spectral
resolution.

From the method described in the previous section for computing the MDCT for two channels
using a single FFT block, it is evident that the transform length for the two paired
channels must be the same. Therefore, pairing for the transformation phase must be such
that channels with identical transform length are grouped together.

It is however possible that not all channels can be paired with such convenience. Assume
that the total number of channels is an even number (if not, take a single FFT for one
channel and the rest form an even group). Suppose out of the m channels, l need a long
transform and therefore m-l require short transforms.

If l is an even number, then since the total is even, it follows that m-l is also even. In this
case, from the l channels that need a long transform, l/2 pairs are formed and for each of
the l/2 pairs a single FFT is computed to estimate the MDCT for the original paired
channels. Similarly, the m-l channels are paired to form (m-l)/2 pairs and for each of the
(m-l)/2 pairs two short FFTs are computed.

Now consider the case when l = 2r + 1 is an odd number. Therefore m - l = 2s + 1 is
also an odd number. The 2r channels requiring a long transform are paired together to form
r pairs and then 2r transforms are computed using r FFTs only. Similarly, for the 2s
channels s pairs are formed. What remains is one channel requiring a long transform and
another requiring two short transforms. Both of these channels are paired together and
two short FFTs are computed to derive the MDCT.
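
The pairing rules above can be sketched as a small planning routine in Python. The function name `plan_transforms` and the output format are assumptions of this example. The text does not say which channel is transformed alone when the channel count is odd; this sketch arbitrarily takes the last channel of whichever group (long or short) has an odd size, so that no channel is forced onto a different transform length in that case.

```python
def plan_transforms(needs_short):
    """Group channels into FFT jobs.

    needs_short[c] is True if channel c needs short transforms.
    Returns a list of (channels, kind) with kind in {"long", "short"};
    `channels` is a 1-tuple for a lone channel or a 2-tuple for a pair.
    """
    long_ch = [c for c, s in enumerate(needs_short) if not s]
    short_ch = [c for c, s in enumerate(needs_short) if s]
    plan = []

    if len(needs_short) % 2 == 1:
        # Odd total: exactly one group has odd size; transform one of its
        # channels alone (which one is an assumption of this sketch).
        group = long_ch if len(long_ch) % 2 == 1 else short_ch
        c = group.pop()
        plan.append(((c,), "short" if needs_short[c] else "long"))

    if len(long_ch) % 2 == 1:
        # l and m-l both odd: the leftover long channel is paired with the
        # leftover short channel, and both use two short transforms.
        plan.append(((long_ch.pop(), short_ch.pop()), "short"))

    for i in range(0, len(long_ch), 2):
        plan.append(((long_ch[i], long_ch[i + 1]), "long"))
    for i in range(0, len(short_ch), 2):
        plan.append(((short_ch[i], short_ch[i + 1]), "short"))
    return plan


# Four channels, only channel 1 has a transient: l = 3, m - l = 1.
plan_odd = plan_transforms([False, True, False, False])
# Four channels, two long and two short: l = 2, m - l = 2.
plan_even = plan_transforms([False, False, True, True])
```

In `plan_odd`, channel 3 is the long channel that ends up sharing two short FFTs with channel 1, which is the constrained case discussed in the text.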

The rationale for constraining the long transform to two short ones is as follows. A short
transform is required for restricting quantization noise associated with the transient within
a small temporal region about the transient, avoiding temporal masking. A long transform
gives slightly better frequency resolution but the error is not much compared to the case
when in the presence of a transient a long transform is utilised. Forcing a long transform
onto a channel in the presence of a transient leads to greater distortion in the final produced
music. This conjecture was proven true by experimental studies on benchmark music
streams.

Combining Windowing with Pre-processing

Before the time domain signal x[n] is transformed to the frequency domain, a windowing
function is usually applied. Thus, if the sampled signal is p[n] then the sequence that is
applied to the frequency transformation block is x[n] = p[n] * w[n], where w[n] is the
windowing function. From the previous sections we noted that before the FFT is
computed for a block a pre-processing is performed as given in Eq. 11 (reproduced below
for convenience). Thus

$$\begin{aligned}
x'[n] &= x[n]\,e^{j\pi n/N} \\
      &= (p[n]\,w[n])\,(\cos(\pi n/N) + j\sin(\pi n/N)) \\
      &= p[n]\,\bigl((w[n]\cos(\pi n/N)) + j\,(w[n]\sin(\pi n/N))\bigr) \qquad \text{Eq. 24}
\end{aligned}$$

From Eq. 24 we note that the windowing function can be combined with the cosine and
sine multiplication required in Eq. 11. This brings down the computation even further
since the sine and cosine are usually implemented in a real time system as table-lookup. If
two tables are constructed as defined below

$$\tau_{\cos}[n] = w[n]\cos(\pi n/N)$$
$$\tau_{\sin}[n] = w[n]\sin(\pi n/N)$$

then Eq. 11 can be rewritten as

$$x'[n] = (p[n]\,\tau_{\cos}[n]) + j\,(p[n]\,\tau_{\sin}[n]) \qquad \text{Eq. 25}$$
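
The table construction can be sketched in Python as follows. The function names `build_tables` and `preprocess` are assumptions of this example, and the sine window used for the check is only a stand-in, since the text does not specify the window w[n].

```python
import cmath
import math


def build_tables(w):
    """Fold the window w[n] into the cosine and sine lookup tables (per Eq. 24)."""
    N = len(w)
    tcos = [w[n] * math.cos(math.pi * n / N) for n in range(N)]
    tsin = [w[n] * math.sin(math.pi * n / N) for n in range(N)]
    return tcos, tsin


def preprocess(p, tcos, tsin):
    """Eq. 25: windowing and pre-multiplication with two real multiplies per sample."""
    return [complex(p[n] * tcos[n], p[n] * tsin[n]) for n in range(len(p))]


N = 8
# Hypothetical window, chosen only to exercise the code.
window = [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]
tcos, tsin = build_tables(window)

p = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, 0.0, 3.0]
combined = preprocess(p, tcos, tsin)
# Reference: window first, then multiply by e^{j*pi*n/N} as in Eq. 11.
separate = [p[n] * window[n] * cmath.exp(1j * math.pi * n / N) for n in range(N)]
table_err = max(abs(a - b) for a, b in zip(combined, separate))
```

The tables depend only on N and the window, so they can be built once per transform length and reused for every block.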

Although the invention has been described herein primarily in terms of its mathematical
derivation and application, and the procedures required for implementation, it will be
readily recognised by those skilled in the art that the procedures described can be
implemented by means of any desired computational apparatus. For example, the
invention may be embodied in computer software operating on general purpose computing
equipment, or may be embodied in purpose built circuitry or contained in microcode or
the like in an integrated circuit or set of integrated circuits.



The foregoing detailed description of embodiments of the invention has been presented by
way of example only, and is not intended to be considered limiting to the invention as
defined in the claims appended hereto.

Glossary of Equations:

MDCT

$$\begin{aligned}
X_k &= \sum_{n=0}^{N-1} x[n]\cos\bigl(2\pi(2n+1)(2k+1)/4N + \pi(2k+1)/4\bigr), \qquad k = 0, \ldots, (N/2-1) \\
    &= \cos\gamma\cdot\bigl(g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)\bigr)
     - \sin\gamma\cdot\bigl(g_{k,r}\sin(\pi(k+1/2)/N) + g_{k,i}\cos(\pi(k+1/2)/N)\bigr) \\
    &= T_1\cos\gamma(k) - T_2\sin\gamma(k)
\end{aligned}$$

$$T_1 = \sum_{n=0}^{N-1} x[n]\cos\alpha(n,k), \qquad T_2 = \sum_{n=0}^{N-1} x[n]\sin\alpha(n,k)$$

$$T_1 = \tfrac{1}{2}(A_1 + A_2)$$

$$A_1 = \sum_{n=0}^{N-1} x[n]\,e^{j\alpha(n,k)}, \qquad A_2 = \sum_{n=0}^{N-1} x[n]\,e^{-j\alpha(n,k)}$$

$$\alpha(n,k) = 2\pi(2n+1)(2k+1)/4N$$
$$\gamma(k) = \pi(2k+1)/4$$

Claims

1. A method for coding audio data comprising a sequence of digital audio samples,
including the steps of:

i) multiplying the input samples with a first trigonometric function factor to
generate an intermediate sample sequence;

ii) computing a fast Fourier transform of the intermediate sample sequence to
generate a Fourier transform coefficient sequence;

iii) for each transform coefficient in the sequence, multiplying the real and
imaginary components of the transform coefficient by respective second trigonometric
function factors, adding the multiplied real and imaginary transform coefficient
components to generate an addition stream coefficient, and subtracting the multiplied real
and imaginary transform coefficient components to generate a subtraction stream
coefficient;

iv) multiplying the addition and subtraction stream coefficients with respective
third trigonometric function factors; and

v) subtracting the corresponding multiplied addition and subtraction stream
coefficients to generate audio coded frequency domain coefficients.
2. A method for coding audio data as claimed in claim 1, wherein the audio coded
frequency domain coefficients comprise modified discrete cosine transform coefficients.

3. A method for coding audio data as claimed in claim 1 or 2, wherein the first
trigonometric function factor for each audio sample is a function of the audio sample
sequence position and the number of samples in the sequence.

4. A method for coding audio data as claimed in claim 3, wherein the respective
second trigonometric function factors for each transform coefficient in the sequence are
respective functions of the transform coefficient sequence position and the number of
coefficients in the sequence.

5. A method for coding audio data as claimed in claim 4, wherein the respective third
trigonometric function factors are respective functions of the transform coefficient
sequence position.

5 6. A method for coding audio data as claimed in claim S, wherein step i) comprises 
multiplying the input sequence samples x[n] by the first trigonometric function factor 
cos(m/N) to generate the intermediate sample sequence, where: 
xfnj are the input sequence audio sanjples; 

is the number of input sequence audio samples; and 
10 n = 0, N-L 

7. A method for coding audio data as claimed in claim 6, wherein step ii) comprises
computing the fast Fourier transform of the intermediate sample sequence so as to generate
said transform coefficient sequence $G_k = g_{k,r} + j\,g_{k,i}$, where:
$G_k$ is the transform coefficient sequence;
$g_{k,r}$ are the real transform coefficient components;
$g_{k,i}$ are the imaginary transform coefficient components; and
$k = 0, \ldots, (N/2-1)$.

8. A method for coding audio data as claimed in claim 7, wherein step iii) comprises
determining the addition stream coefficients and subtraction stream coefficients
according to:

$$T_1 = g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)$$
$$T_2 = g_{k,r}\cos(\pi(k+1/2)/N) + g_{k,i}\sin(\pi(k+1/2)/N)$$
where $T_1$ and $T_2$ are the subtraction stream and addition stream coefficients, respectively.

9. A method for coding audio data as claimed in claim 8, wherein steps iv) and v)
comprise generating the audio coded frequency domain coefficients according to:

$$X_k = T_1\cos(\pi(2k+1)/4) - T_2\sin(\pi(2k+1)/4)$$
where $X_k$ are the audio coded frequency domain coefficients; and
$\cos(\pi(2k+1)/4)$ and $\sin(\pi(2k+1)/4)$ are the third trigonometric function factors.
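
The three families of trigonometric factors named in claims 3 to 9 can be read as a split of the transform phase. As a sketch only, assuming the coded coefficients follow an AC-3-style MDCT kernel (an assumption; the claims themselves do not state the transform definition), the cosine argument separates into a Fourier kernel term plus three terms with the dependencies described in claims 3, 4 and 5:

```latex
\[
\frac{2\pi(2n+1)(2k+1)}{4N} + \frac{\pi(2k+1)}{4}
  = \underbrace{\frac{2\pi nk}{N}}_{\text{Fourier kernel}}
  + \underbrace{\frac{\pi n}{N}}_{\text{depends on } n,\,N}
  + \underbrace{\frac{\pi\left(k+\tfrac12\right)}{N}}_{\text{depends on } k,\,N}
  + \underbrace{\frac{\pi(2k+1)}{4}}_{\text{depends on } k}
\]
```

The identity follows from expanding $(2n+1)(2k+1) = 4nk + 2n + 2k + 1$; the last three terms match the arguments of the first, second and third trigonometric function factors in claims 6, 8 and 9.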
10. A method for coding audio data, including the steps of:

combining first and second sequences of digital audio samples from first and 
second audio channels into a single complex sample sequence; 

determining a Fourier transform coefficient sequence as defined in any preceding
claim;

generating first and second transform coefficient sequences by combining and/or
differencing first and second selected transform coefficients from said Fourier transform
coefficient sequence; and

for each of the first and second transform coefficient sequences, generating audio
coded frequency domain coefficients as defined in any preceding claim, so as to generate
respective sequences of said audio coded frequency domain coefficients for the first and
second audio channels.

11. A method for coding audio data as claimed in claim 10, wherein the step of
generating first and second transform coefficient sequences comprises, for each
corresponding coefficient in the first and second transform coefficient sequences, selecting
first and second transform coefficients from said Fourier transform coefficient sequence,
determining a complex conjugate of said second transform coefficient, combining said first
transform coefficient and said complex conjugate for said first transform coefficient
sequence and differencing said first transform coefficient and said complex conjugate for
said second transform coefficient sequence.

12. A method for coding audio data as claimed in claim 10 or 11, wherein the
multiplying step i) comprises multiplying the input sequence samples $z[n]$ by the first
trigonometric function factor $\cos(\pi n/N) + j\sin(\pi n/N)$ to generate the intermediate sample
sequence, where:
$z[n] = x[n] + j\,y[n]$ is the complex sample sequence;
$x[n]$ is the first sequence of digital audio samples;
$y[n]$ is the second sequence of digital audio samples;
$N$ is the number of input sequence audio samples in each sequence;
$n = 0, \ldots, N-1$; and
$j$ is the complex constant.

13. A method for coding audio data as claimed in claim 11 or 12, wherein said first
and second transform coefficient sequences are generated according to:

$$G_k = (Z_k + Z^*_{N-k-1})/2$$
$$G'_k = (Z_k - Z^*_{N-k-1})/2j$$
where $G_k$ is said first transform coefficient sequence;
$G'_k$ is said second transform coefficient sequence;
$N$ is the number of input sequence audio samples;
$k = 0, \ldots, (N/2-1)$;
$Z_k$ is said first transform coefficient;
$Z^*_{N-k-1}$ is the complex conjugate of said second transform coefficient; and
$j$ is the complex constant.
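
The combining and differencing of claims 11 to 13 can be checked numerically. The sketch below is illustrative only: it assumes the transform kernel is exp(+j2πnk/N) (the claims do not state a sign convention; with this choice the index N-k-1 of claim 13 pairs up correctly), it uses a naive O(N²) transform as a stand-in for a real FFT, and all function names are hypothetical.

```python
import cmath

def transform_pos(v):
    """Naive DFT with kernel exp(+2j*pi*n*k/N); assumed sign convention, O(N^2) stand-in for an FFT."""
    N = len(v)
    return [sum(v[n] * cmath.exp(2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def premultiply(v):
    """Multiply each sample by cos(pi*n/N) + j*sin(pi*n/N), as in claim 12."""
    N = len(v)
    return [v[n] * cmath.exp(1j * cmath.pi * n / N) for n in range(N)]

def split_channels(Z):
    """Recover per-channel coefficients G_k and G'_k from the joint sequence Z_k (claim 13)."""
    N = len(Z)
    G, Gp = [], []
    for k in range(N // 2):
        zc = Z[N - k - 1].conjugate()
        G.append((Z[k] + zc) / 2)
        Gp.append((Z[k] - zc) / 2j)
    return G, Gp

# Two arbitrary real test channels, packed into one complex sequence z[n] = x[n] + j*y[n].
x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 1.0]
y = [1.0, 0.5, -0.5, 2.0, -1.25, 0.75, 0.3, -2.0]
z = [complex(a, b) for a, b in zip(x, y)]

# One joint transform, then the split.
G, Gp = split_channels(transform_pos(premultiply(z)))

# Reference: transform each channel separately and keep the first N/2 coefficients.
Gx = transform_pos(premultiply(x))[:len(x) // 2]
Gy = transform_pos(premultiply(y))[:len(y) // 2]

max_err = max(max(abs(a - b) for a, b in zip(G, Gx)),
              max(abs(a - b) for a, b in zip(Gp, Gy)))
print("largest deviation from separate transforms:", max_err)
```

Under these assumptions one complex transform replaces two separate ones, which is the saving the paired-channel claims rely on.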



14. A method for coding audio data as claimed in any one of claims 10 to 13,
including examining said first and second sequences of digital audio samples to determine
a short or long transform length, and coding the audio samples using a short or long
transform length as determined.

15. A method for coding audio data comprising sequences of digital audio samples
from a plurality of audio channels, comprising determining a transform length for each of
the channels, pairing the channels according to their determined transform length, and
coding the audio samples of first and second channels in each pair, as defined in any one
of claims 10 to 13, according to the determined transform length.
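
The pairing step of claim 15 can be pictured with a small grouping routine. This is a minimal sketch under stated assumptions: the channel names, the "long"/"short" labels and the function `pair_channels` are hypothetical, and the handling of a leftover long channel with a leftover short channel mentioned in the abstract is not modelled here; unpaired channels are simply reported.

```python
def pair_channels(lengths):
    """Group channels by transform length and pair them off.

    lengths: dict mapping a channel name to "long" or "short" (hypothetical labels).
    Returns (pairs, unpaired), where pairs holds (first, second, length) tuples
    and unpaired holds (channel, length) tuples that found no partner.
    """
    by_length = {}
    for channel, length in lengths.items():
        by_length.setdefault(length, []).append(channel)

    pairs, unpaired = [], []
    for length, channels in by_length.items():
        # Take channels two at a time; each pair can share one complex transform.
        while len(channels) >= 2:
            pairs.append((channels.pop(0), channels.pop(0), length))
        unpaired.extend((channel, length) for channel in channels)
    return pairs, unpaired

# Example frame: five channels with per-channel transform length decisions.
decisions = {"L": "long", "C": "short", "R": "long", "Ls": "short", "Rs": "long"}
pairs, unpaired = pair_channels(decisions)
print(pairs)
print(unpaired)
```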


16. A method for coding audio data as claimed in any preceding claim, including 
applying a windowing function in combination with said multiplying step i). 
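
Combining the window with multiplying step i) amounts to precomputing one table, so that each sample needs a single multiplication instead of two. The sketch below assumes the complex pre-multiplication factor of claim 12 and a sine window chosen purely for illustration; the claims do not specify the window, and the function name is hypothetical.

```python
import cmath
import math

def combined_table(window):
    """Fold the window into the pre-multiplication factor: c[n] = w[n] * (cos(pi*n/N) + j*sin(pi*n/N))."""
    N = len(window)
    return [window[n] * cmath.exp(1j * math.pi * n / N) for n in range(N)]

N = 8
# Hypothetical window for illustration only (a sine window).
window = [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]
table = combined_table(window)

x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 1.0]

# One multiplication per sample using the precomputed table.
one_pass = [x[n] * table[n] for n in range(N)]

# Reference: window first, then pre-multiply, as two separate passes.
two_pass = [(x[n] * window[n]) * cmath.exp(1j * math.pi * n / N) for n in range(N)]

max_diff = max(abs(a - b) for a, b in zip(one_pass, two_pass))
print("largest difference between one-pass and two-pass results:", max_diff)
```

The table depends only on N and the window, so it can be built once per transform length and reused for every block.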



17. A method for coding audio data including the steps of:

obtaining at least one input sequence of digital audio samples; 

pre-processing the input sequence samples including applying a pre-multiplication
factor to obtain modified input sequence samples;

transforming the modified input sequence samples into a transform coefficient
sequence utilising a fast Fourier transform; and

post-processing the sequence of transform coefficients including applying first
post-multiplication factors to the real and imaginary coefficient components, differencing and
combining the post-multiplied real and imaginary components, applying second
post-multiplication factors to the difference and combination results, and differencing to obtain
a sequence of modified discrete cosine transform coefficients representing said input
sequence of digital audio samples.

18. A method as claimed in claim 17, wherein the pre-multiplication factor, and first
and second post-multiplication factors are trigonometric function factors.

19. A method as claimed in claim 18, wherein the pre-multiplication factor applied to
each digital audio sample in the input sequence is a trigonometric function of the audio
sample sequence position and the number of samples in the sequence.

20. A method as claimed in claim 18, wherein the first post-multiplication factors for
each transform coefficient in the sequence are trigonometric functions of the transform
coefficient sequence position and the number of coefficients in the sequence.

21. A method as claimed in claim 18, wherein the second post-multiplication factor for
each difference or combination result is a trigonometric function of the transform
coefficient sequence position of the coefficients used in the difference or combination.

22. A method as claimed in any one of claims 17 to 21, wherein the pre-processing 
operations are performed on each sample in the input sequence individually. 

23. A method as claimed in any one of claims 17 to 22, wherein the post-processing
operations are performed on each transform coefficient in the sequence individually.

24. A method for coding audio data including the steps of:

obtaining first and second input sequences of digital audio samples corresponding
to respective first and second audio channels;

combining the first and second input sequences of digital audio samples into a
single complex input sample sequence;

pre-processing the complex input sequence samples including applying a
pre-multiplication factor to obtain modified complex input sequence samples;

transforming the modified complex input sequence samples into a complex
transform coefficient sequence utilising a fast Fourier transform; and

post-processing the sequence of complex transform coefficients to obtain first and
second sequences of audio coded frequency domain coefficients corresponding to the first
and second audio channels including, for each corresponding frequency domain coefficient
in the first and second sequences, selecting first and second complex transform coefficients
from said sequence of complex transform coefficients, combining the first complex
transform coefficient and the complex conjugate of the second complex transform
coefficient for said first channel and differencing the first complex transform coefficient
and the complex conjugate of the second complex transform coefficient for said second
channel, and applying respective post-multiplication factors to the combination and
difference to obtain said audio coded frequency domain coefficients corresponding to the
first and second audio channels.

25. A method as claimed in claim 24, wherein the pre-multiplication factor for each 
sample in the complex input sample sequence comprises a complex trigonometric function 
of the complex input sample sequence position and the number of samples in the sequence. 


26. A method as claimed in claim 24 or 25, wherein the post-processing for each of the
first and second channels includes applying first post-multiplication factors to the real and
imaginary coefficient components, differencing and combining the post-multiplied real and
imaginary components, applying second post-multiplication factors to the difference and
combination results, and differencing to obtain a sequence of modified discrete cosine
transform coefficients representing said input sequence of digital audio samples.

27. A method for coding audio data including the steps of:

obtaining first and second input sequences of digital audio samples $x[n]$, $y[n]$
corresponding to respective first and second audio channels;

combining the first and second input sequences of digital audio samples into a
single complex input sample sequence $z[n]$, where $z[n] = x[n] + j\,y[n]$;

pre-processing the complex input sequence samples including applying a
pre-multiplication factor $\cos(\pi n/N) + j\sin(\pi n/N)$ to obtain modified complex input sequence
samples, where $N$ is the number of audio samples in each of the first and second input
sequences and $n = 0, \ldots, (N-1)$;

transforming the modified complex input sequence samples into a complex
transform coefficient sequence $Z_k$ utilising a fast Fourier transform, wherein
$k = 0, \ldots, (N/2-1)$; and

post-processing the sequence of complex transform coefficients to obtain first and
second sequences of audio coded frequency domain coefficients corresponding to the first
and second audio channels $X_k$, $Y_k$ according to:

$$\begin{aligned}
G_k  &= (Z_k + Z^*_{N-k-1})/2 \\
G'_k &= (Z_k - Z^*_{N-k-1})/2j, \qquad k = 0, \ldots, N/2-1 \\
X_k &= \cos\gamma \,\bigl(g_{k,r}\cos(\pi(k+1/2)/N) - g_{k,i}\sin(\pi(k+1/2)/N)\bigr) \\
    &\quad - \sin\gamma \,\bigl(g_{k,r}\sin(\pi(k+1/2)/N) + g_{k,i}\cos(\pi(k+1/2)/N)\bigr) \\
Y_k &= \cos\gamma \,\bigl(g'_{k,r}\cos(\pi(k+1/2)/N) - g'_{k,i}\sin(\pi(k+1/2)/N)\bigr) \\
    &\quad - \sin\gamma \,\bigl(g'_{k,r}\sin(\pi(k+1/2)/N) + g'_{k,i}\cos(\pi(k+1/2)/N)\bigr)
\end{aligned}$$

where $G_k$ is a transform coefficient sequence for the first channel;
$G'_k$ is a transform coefficient sequence for the second channel;
$g_{k,r}$ and $g_{k,i}$ are the real and imaginary transform coefficient components of $G_k$;
$g'_{k,r}$ and $g'_{k,i}$ are the real and imaginary transform coefficient components of $G'_k$;
$Z^*_{N-k-1}$ is the complex conjugate of $Z_{N-k-1}$; and
$\gamma(k) = \pi(2k+1)/4$.
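
The steps of claim 27 can be traced end to end on a small block. The following is a sketch under stated assumptions rather than the applicant's implementation: the transform kernel is taken as exp(+j2πnk/N), the reference transform is an unscaled AC-3-style MDCT sum (neither convention is stated in the claims), and the names `fft_pos`, `mdct_pair` and `mdct_direct` are hypothetical.

```python
import cmath
import math

def fft_pos(v):
    """Recursive radix-2 FFT with kernel exp(+2j*pi*n*k/N) (assumed sign); len(v) must be a power of two."""
    N = len(v)
    if N == 1:
        return list(v)
    even = fft_pos(v[0::2])
    odd = fft_pos(v[1::2])
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

def post_multiply(G, k, N):
    """Apply the second and third trigonometric factors of claim 27 to one coefficient G."""
    a = math.pi * (k + 0.5) / N        # argument of the second factors
    gamma = math.pi * (2 * k + 1) / 4  # argument of the third factors
    diff = G.real * math.cos(a) - G.imag * math.sin(a)
    summ = G.real * math.sin(a) + G.imag * math.cos(a)
    return math.cos(gamma) * diff - math.sin(gamma) * summ

def mdct_pair(x, y):
    """Code two equal-length real blocks with a single complex FFT."""
    N = len(x)
    z = [complex(x[n], y[n]) * cmath.exp(1j * math.pi * n / N) for n in range(N)]
    Z = fft_pos(z)
    X, Y = [], []
    for k in range(N // 2):
        zc = Z[N - k - 1].conjugate()
        X.append(post_multiply((Z[k] + zc) / 2, k, N))
        Y.append(post_multiply((Z[k] - zc) / 2j, k, N))
    return X, Y

def mdct_direct(x):
    """Reference O(N^2) sum using the assumed unscaled AC-3-style MDCT kernel."""
    N = len(x)
    return [sum(x[n] * math.cos(2 * math.pi * (2 * n + 1) * (2 * k + 1) / (4 * N)
                                + math.pi * (2 * k + 1) / 4)
                for n in range(N))
            for k in range(N // 2)]

x = [0.5, -1.0, 2.0, 0.25, 1.5, -0.75, 0.0, 1.0]
y = [1.0, 0.5, -0.5, 2.0, -1.25, 0.75, 0.3, -2.0]
X, Y = mdct_pair(x, y)
err_x = max(abs(a - b) for a, b in zip(X, mdct_direct(x)))
err_y = max(abs(a - b) for a, b in zip(Y, mdct_direct(y)))
print("max deviation, channel 1:", err_x)
print("max deviation, channel 2:", err_y)
```

With these conventions the fast path and the direct sum agree to rounding error, while the two channels share one N-point complex transform.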



WO 99/43110    PCT/SG98/00014

[Drawing sheets 2/4 to 4/4. Legible label on sheet 2/4: "Exponent Strategy".]

INTERNATIONAL SEARCH REPORT

International Application No
PCT/SG 98/00014

A. CLASSIFICATION OF SUBJECT MATTER
IPC 6 H04H1/00

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)
IPC 6 H04H

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Citation of document, with indication, where appropriate, of the relevant passages; relevant to claim No.

EP 0 506 111 A (MITSUBISHI ELECTRIC CORP) 30 September 1992
see page 2, line 1 - page 5, line 16; claim 1; figure 1
Relevant to claim No.: 1, 10, 17, 24, 27

EP 0 590 790 A (SONY CORP) 6 April 1994
see page 2, line 1 - page 6, line 11; claims 1, 8; figure 1
Relevant to claim No.: 1, 10, 17, 24, 27

US 5 181 183 A (MIYAZAKI TAKASHI) 19 January 1993
see column 1, line 1 - column 2, line 27; claim 1; figure 1
Relevant to claim No.: 1, 10, 17, 24, 27

Further documents are listed in the continuation of box C.

Patent family members are listed in annex.



* Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of particular relevance
"E" earlier document but published on or after the international filing date
"L" document which may throw doubts on priority claim(s) or which is cited to establish the publication date of another citation or other special reason (as specified)
"O" document referring to an oral disclosure, use, exhibition or other means
"P" document published prior to the international filing date but later than the priority date claimed
"T" later document published after the international filing date or priority date and not in conflict with the application but cited to understand the principle or theory underlying the invention
"X" document of particular relevance; the claimed invention cannot be considered novel or cannot be considered to involve an inventive step when the document is taken alone
"Y" document of particular relevance; the claimed invention cannot be considered to involve an inventive step when the document is combined with one or more other such documents, such combination being obvious to a person skilled in the art
"&" document member of the same patent family



Date of the actual completion of the international search
13 November 1998

Date of mailing of the international search report
23/11/1998

Name and mailing address of the ISA
European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer
De Haan, A. J.

Form PCT/ISA/210 (second sheet) (July 1992)

page 1 of 2



C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category; citation of document, with indication, where appropriate, of the relevant passages; relevant to claim No.

Category: A
EP 0 564 089 A (AMERICAN TELEPHONE & TELEGRAPH) 6 October 1993
see page 2, line 1 - page 3, line 57; claim 1; figure 1
Relevant to claim No.: 1, 10, 17, 24, 27

Category: A
US 5 592 584 A (FERREIRA ANIBAL J ET AL) 7 January 1997
see column 1, line 1 - column 3, line 67; claim 1; figures 1, 2
Relevant to claim No.: 1, 10, 17, 24, 27

Category: A
EP 0 718 746 A (PHILIPS ELECTRONIQUE LAB ; PHILIPS ELECTRONICS NV (NL)) 26 June 1996
see page 2, line 1 - page 3, line 3; claim 1; figure 1
Relevant to claim No.: 1, 10, 17, 24, 27

Form PCT/ISA/210 (continuation of second sheet) (July 1992)

page 2 of 2



Information on patent family members

Patent document cited    Publication    Patent family     Publication
in search report         date           member(s)         date

EP 0506111 A             30-09-1992     JP 4313157 A      05-11-1992
                                        US 5249146 A      28-09-1993

EP 0590790 A             06-04-1994     JP 6112909 A      22-04-1994
                                        US 5646960 A      08-07-1997
                                        US 5640421 A      17-06-1997

US 5181183 A             19-01-1993     JP 2646778 B      27-08-1997
                                        JP 3211604 A      17-09-1991

EP 0564089 A             06-10-1993     CA 2090052 A      03-09-1993
                                        JP 6029859 A      04-02-1994
                                        US 5592584 A      07-01-1997

US 5592584 A             07-01-1997     CA 2090052 A      03-09-1993
                                        EP 0564089 A      06-10-1993
                                        JP 6029859 A      04-02-1994

EP 0718746 A             26-06-1996     JP 8241187 A      17-09-1996
                                        US 5684730 A      04-11-1997

